Graph Partitioning Induced Phase Transitions

We study the percolation properties of graph partitioning on random regular graphs
with $N$ vertices of degree $k$. Optimal graph partitioning is directly related to
optimal attack and immunization of complex networks. We find that for any partitioning
process (even if non-optimal) that partitions the graph into equal sized connected
components (clusters), the system undergoes a percolation phase transition at
$f = f_c = 1 - 2/k$, where $f$ is the fraction of edges removed to partition the graph.
For optimal partitioning, at the percolation threshold, we find $S \sim N^{0.4}$, where
$S$ is the size of the clusters, and $\ell \sim N^{0.25}$, where $\ell$ is their
diameter. Additionally, we find that $S$ undergoes multiple non-percolation transitions
for $f < f_c$.

PACS numbers: 



The graph partitioning problem deals with assigning vertices in a graph to different
partitions such that no partition is greater than a given size. The optimal solution
is the one that minimizes the fraction of edges $f$ that must be removed so that no
edges remain between partitions [1].

Graph partitioning is of interest not only because of the large body of previous
research on it but also because optimal partitioning is equivalent to optimal
attack/immunization of a complex network. That is, the percolation threshold $f_c$,
at which global connectivity is lost, will be lower than that for any other type of
attack/immunization, and the measure of fragmentation $F$ for all values of $f$ will
be higher than for any other type of attack/immunization Q.

Graph partitioning is a much-studied subject with a long history of work by
mathematicians and computer scientists. The problem of determining the optimal
solution is NP-complete. Mathematicians have pursued rigorous bounds on the minimum
number of edges that must be removed to partition (usually random regular) graphs into
two equal sized partitions [1, H, 0, 0]. Computer scientists have pursued efficient
algorithms which heuristically find good approximations to the optimal solution [ESSU^.

Here we study graph partitioning from the standpoint of statistical physics. To make
contact with percolation theory [11, 12], we identify the number of edges removed as
the control variable and study the inverse problem: given that we are allowed to
remove a fraction $f$ of the edges from the graph, how can we partition the graph to
minimize the size of the largest partition? We denote by $S$ the size of the largest
connected component (cluster) which results from the partitioning. Then $S$ plays the
role of order parameter, and we are interested in the behavior of $S$ as a function of
$f$. We ask if there is a critical value $f_c$ such that for $f < f_c$, $S \sim N$,
while for $f > f_c$, $S$ scales more slowly than $O(N)$. That is, does the graph
undergo a percolation phase transition? If so, what is the percolation threshold $f_c$
and what are the critical exponents associated with the phase transition?

We study random $k$-regular graphs, random graphs the vertices of which all have the
same degree, $k$. We study these graphs because of their intrinsic interest and
because they are examples of expander graphs, which are extremely robust to node or
edge removal [13] m. They are therefore a good testbed for optimal graph partitioning.

We find that a percolation transition does in fact exist, and we analytically
determine $f_c$. We also estimate critical exponents associated with the transition.
In addition, however, we find that for $f < f_c$ the graph undergoes a large number of
first order transitions related to the partitioning process.

Percolation Threshold. The percolation threshold can be determined analytically as
follows. In Refs. [H, [lB| it was argued that for a random graph having a degree
distribution $P(k)$ to have a spanning cluster, a vertex $j$ which is reached by
following a link (from vertex $i$) on the giant cluster must have at least one other
link, on average, to allow the cluster to exist. Or, given that vertex $i$ is
connected to $j$, the average degree of vertex $j$ must be at least 2:

    $\langle k_j \mid i \leftrightarrow j \rangle = 2$.    (1)



We will show below that, for large $N$ at the percolation threshold, all partitions
are essentially the same size and that each partition consists of one cluster |l3|.
Then, to achieve Eq. (1), the average degree in each cluster must be 2, and $p_c$, the
fraction of edges which must be present, is

    $p_c \equiv 1 - f_c = 2/k$.    (2)

This is to be compared to the random site or bond percolation threshold
$p_c = 1/(k - 1)$ [15].
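
The two thresholds can be compared numerically. The following is a minimal sketch, not part of the original analysis; the function names are our own, and both thresholds are expressed as the fraction of edges removed, so that the partitioning value is $1 - 2/k$ from Eq. (2) and the random percolation value is $1 - 1/(k - 1)$.

```python
from fractions import Fraction


def partition_threshold(k):
    """Fraction of edges removed at the partitioning threshold, f_c = 1 - 2/k (Eq. 2)."""
    return 1 - Fraction(2, k)


def random_percolation_threshold(k):
    """Fraction removed at the random percolation threshold, 1 - 1/(k - 1)."""
    return 1 - Fraction(1, k - 1)


# The degrees used here are the ones shown in Fig. 1.
for k in (3, 6, 10, 20):
    print(k, partition_threshold(k), random_percolation_threshold(k))
```

For every $k > 2$ the partitioning threshold is the smaller of the two, in line with the statement that optimal partitioning destroys global connectivity with fewer removed edges than random removal.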



We can gain insight into the structure of the spanning clusters by noting that for
tree graphs with $n$ vertices

    $\langle k \rangle = 2(n - 1)/n$,    (3)

which approaches 2 as $n \to \infty$. For finite graphs, however, to satisfy
$\langle k \rangle = 2$ there must be on average one loop in each graph. Thus, at the
percolation threshold, the clusters contain on average one loop. Our problem can be
restated as: how do we partition a graph into the largest number of equal sized
partitions, each composed of one cluster with on average one loop per cluster? The
larger the number of partitions (and thus the smaller the partition size), the closer
the solution is to the optimal one. Different types of partitioning that maintain one
cluster per partition will result in the same critical point, but the scaling of the
cluster size at the critical point may depend on the optimality of the partitioning.
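
The counting behind the one-loop statement can be checked directly. This is a small illustrative sketch (the helper name `mean_degree` is ours): a connected cluster with $n$ vertices and a given number of independent loops has $n - 1$ tree edges plus one extra edge per loop.

```python
from fractions import Fraction


def mean_degree(n, loops=0):
    """Mean degree of a connected graph with n vertices and `loops` independent cycles."""
    edges = (n - 1) + loops  # spanning-tree edges plus one extra edge per loop
    return Fraction(2 * edges, n)


print(mean_degree(10))           # tree: 2(n - 1)/n, Eq. (3)
print(mean_degree(10, loops=1))  # exactly one loop: mean degree is exactly 2
```

With exactly one loop the mean degree equals 2 for any $n$, which is why the condition of Eq. (1) translates into one loop per cluster.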

Optimal Partitioning. We use the METIS graph partitioning program ^], which provides
close to optimal graph partitioning. For the same random graph we run the program many
times over the range of partition sizes in which we are interested. After each
partitioning we identify the clusters in the graph, determine the size of the largest
cluster, and note the number of edges that had to be removed for the partitioning. For
each value of the number of edges removed, we keep the minimum value of the size of
the largest cluster over all partitionings.
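
The bookkeeping described above can be sketched as follows. This is only an illustration under our own assumptions: the partition assignment `part` is taken as given (it stands in for the output of a partitioning program and no METIS call is shown), and the names `evaluate_partition` and `record` are hypothetical.

```python
from collections import Counter


def find(parent, v):
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def evaluate_partition(n, edges, part):
    """Return (edges removed, largest cluster size) for the assignment part[v]."""
    parent = list(range(n))
    removed = 0
    for u, v in edges:
        if part[u] != part[v]:
            removed += 1  # edge between partitions must be cut
        else:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv  # edge kept: merge the two clusters
    sizes = Counter(find(parent, v) for v in range(n))
    return removed, max(sizes.values())


def record(best, removed, largest):
    """Keep, for each number of removed edges, the smallest largest-cluster size seen."""
    if removed not in best or largest < best[removed]:
        best[removed] = largest


# Toy example: a ring of 6 vertices and two hand-written assignments.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
best = {}
for part in ([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 0, 0]):
    record(best, *evaluate_partition(6, ring, part))
print(best)
```

Both assignments cut two edges, but only the first yields equal clusters, so the entry kept for two removed edges is the smaller largest-cluster size.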

Figure 1 illustrates the behavior of $s = S/N$ versus $f$ for various values of $k$
[l^. In what follows we analyze the case $k = 3$ in depth; similar results are
obtained for other values of $k$.

In Fig. 2, for N = 10^, we plot $P(S)$, the distribution of cluster sizes $S$, at the
threshold predicted by Eq. (2), $f_c = 1/3$. As expected, the distribution is very
strongly peaked: almost all clusters are the same size. In the inset in Fig. 1, for
$k = 3$ and various values of $N$, we plot $s$ versus $f$. Below $f_c$ the plots
collapse, indicating that here $S \sim N$. In the vicinity of and above $f_c$ the
plots no longer collapse, a manifestation of $S$ scaling more slowly than $N$.

In Fig. 3 we plot $S_c$, the value of $S$ at the percolation threshold, versus $N$.
The slope of the plot is consistent with

    $S_c \sim N^{x}$,    (4)

where $x \approx 0.4$. In Fig. 4 we plot the largest cluster size versus $N$ for
various values of $f$ and see that the straightest plot is for $f = 1/3$, the
predicted critical threshold.

In Fig. 3 we also plot $\ell$, the chemical size (diameter) of the critical clusters,
versus $N$. The slope of the plot is consistent with

    $\ell \sim N^{z}$,    (5)

where $z \approx 0.25$. From Eqs. (4) and (5) we obtain

    $S_c \sim \ell^{d_l}$,    (6)

FIG. 1: Normalized largest cluster size, $s = S/N$, versus fraction of edges removed,
$f$, for random regular graphs with number of vertices N = 10* of degree (from left to
right) $k$ = 3, 6, 10, and 20. The vertical lines at the x-axis mark the predicted
values of $f_c = 1 - 2/k$, from left to right for $k$ = 3, 6, 10, and 20. The dashed
horizontal lines at $s$ = 1/2, 1/3, 1/4, and 1/5 are the values of $s$ for which the
first few non-percolation transitions take place. Inset: For (from top to bottom on
right) N = 10", 3 x 10", and 10"^ and $k = 3$, $s$ versus $f$. The data collapse until
$f$ is in the vicinity of $f_c = 1/3$ (indicated by vertical line).

FIG. 2: For N = 10'* and $k = 3$ at criticality, $P(S)$, the distribution of cluster
sizes $S$. Inset: plot of $P(S_B)$, the distribution of blob sizes $S_B$, for
N = 10'' and $k = 3$.

where $d_l = x/z \approx 1.6$. The exponent $d_l$ is a measure of the compactness of
the clusters: clusters with $d_l = 1$ are essentially chains; higher values of $d_l$
correspond to denser structures. For random percolation, $d_l = 2$ [12, lj|. The inset
in Fig. 5 shows a representative critical cluster obtained from partitioning. Note the
single loop required by Eq. (1) and its "stringy" structure, the manifestation of
$d_l \approx 1.6$. In Fig. 5 we plot the distribution of the number of loops per
cluster, $P(n_{\rm loop})$, and note that it is fairly narrow with the most probable
value being 1. Thus, not only is the average number of loops per cluster 1 but the
most probable number is also 1.

FIG. 3: Largest cluster size at criticality, $S_c$ (squares), chemical size of largest
cluster, $\ell$ (circles), and most probable blob size $S_B$ (triangles), versus
number of vertices $N$ in graph.
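
The exponents in Eqs. (4) to (6) are slopes on log-log plots. The sketch below shows one common way to estimate such a slope, an ordinary least-squares fit of $\log y$ against $\log x$. The input arrays are synthetic, noise-free placeholders generated from exact power laws with exponents 0.4 and 0.25; they are not the measurements behind Fig. 3, and the fitting procedure is an assumption rather than the method used for the figures.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


# Synthetic placeholder data (exact power laws), NOT data from the paper.
sizes = [10 ** 3, 10 ** 4, 10 ** 5]
s_c = [n ** 0.4 for n in sizes]    # stands in for S_c(N)
ell = [n ** 0.25 for n in sizes]   # stands in for the chemical size

x = loglog_slope(sizes, s_c)
z = loglog_slope(sizes, ell)
d_l = x / z  # chemical dimension, Eq. (6)
print(x, z, d_l)
```

On real data the fitted slopes carry statistical and finite-size uncertainty, so the ratio $x/z$ should be quoted with corresponding error bars.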
The exponent $\nu$ is defined by

    $r \sim \ell^{\nu}$,    (7)

where $r$ is Euclidean distance. At the percolation threshold, $\nu$ is expected to be
1/2, the same value as for a random walk (or for a network embedded in a very high
dimensional lattice, such that spatial constraints are irrelevant) jia].

Using Eq. (6) with $d_l = 1.6$ and Eq. (7) with $\nu = 1/2$, we can determine the
fractal dimension of the percolation clusters at criticality, defined by

    $S_c \sim r^{d_f}$,    (8)

to be

    $d_f = d_l/\nu \approx 3.2$.    (9)

Assuming that our problem of optimal partitioning on random regular graphs has an
analog on lattices in Euclidean space of dimension $d$, in which

    $N \sim r^{d}$,    (10)

where $r$ is the length of a side of the lattice, we can determine the upper critical
dimension $d_c$ for that analog. The upper critical dimension is defined such that for
$d > d_c$ all critical exponents are unchanged. Since random graphs can be considered
to be embedded in an infinite dimensional space, the critical exponents for our
problem should be the same as those at the critical dimension for the Euclidean
analog. Using Eqs. (8) and (10), we find $S_c \sim N^{d_f/d_c} \sim N^{0.4}$ and thus
$d_c = 8$, which interestingly is the critical dimension for lattice animals and
branched polymers [l^, [2l|.
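
The chain of exponent relations leading to $d_c = 8$ can be written out as plain arithmetic. The sketch below uses exact fractions and treats the rounded estimates $x = 0.4$ and $d_l = 1.6$ as exact inputs, which is our simplification; the measured values are only approximate.

```python
from fractions import Fraction

x = Fraction(4, 10)     # S_c ~ N^x, Eq. (4), taken as exactly 0.4
d_l = Fraction(16, 10)  # S_c ~ ell^{d_l}, Eq. (6), taken as exactly 1.6
nu = Fraction(1, 2)     # r ~ ell^nu, Eq. (7)

d_f = d_l / nu  # S_c ~ r^{d_f}, Eqs. (8) and (9)
d_c = d_f / x   # from S_c ~ N^{d_f/d_c} with N ~ r^{d_c}, Eq. (10)
print(d_f, d_c)
```

Because $d_c$ is a ratio of two estimated exponents, small changes in $x$ or $d_l$ shift it noticeably, so the value 8 should be read as consistent with the data rather than exact.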

We can learn more about the fractal structure of the spanning cluster at $f_c$ by
analyzing the 2-connected components (blobs) ^] within the spanning clusters. This is
equivalent to analyzing the loops within the spanning clusters because the typical
cluster contains 1 loop, which is the 2-connected component in the cluster. In Fig. 3
we plot the most probable blob size (equivalent to the length of loops), $S_B$, versus
$N$. The scaling is consistent with a power law in $N$ similar to the scaling of the
chemical length of the whole cluster. From this we infer that the chemical size of the
cluster is driven by the size of the loops.

FIG. 4: Largest cluster size for (from top to bottom) values of $f$ = 0.331, 0.332,
0.333, 1/3 (solid line), 0.335, 0.337, and 0.34, versus number of vertices $N$ in
graph. The straightest plot is for $f = 1/3$, the predicted value of $f_c$.
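
For a cluster with exactly one loop, the blob can be isolated by repeatedly stripping vertices of degree one until only the loop remains. The sketch below illustrates this pruning on a toy cluster; it assumes a simple connected graph stored as an adjacency dictionary, the function names are ours, and for clusters with more than one loop the pruned core may also contain paths joining separate blobs.

```python
def prune_to_core(adj):
    """Return the vertices left after repeatedly removing vertices of degree <= 1."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    removed = set()
    stack = [v for v, d in degree.items() if d <= 1]
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                if degree[w] == 1:
                    stack.append(w)  # w has become a dangling end
    return set(adj) - removed


def loop_count(adj):
    """Number of independent loops, E - V + 1, of a connected cluster."""
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return edges - len(adj) + 1


# Toy cluster: a 4-cycle 0-1-2-3 decorated by the trees 3-4-5 and 1-6.
cluster = {
    0: [1, 3], 1: [0, 2, 6], 2: [1, 3], 3: [2, 0, 4],
    4: [3, 5], 5: [4], 6: [1],
}
blob = prune_to_core(cluster)
print(loop_count(cluster), sorted(blob))
```

The size of the returned set is the blob size $S_B$ for such a single-loop cluster.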

Non-optimal partitioning. We find that for partitioning in which we ensure that each
partition consists of one cluster but make no attempt to minimize the number of edges
between partitions, $f_c$ is, as predicted above, also $1 - 2/k$, but at criticality
$S \sim N^{1/2}$. That is, the clusters at criticality are larger than those at
criticality for optimal partitioning. The argument that the exponent is exactly 1/2 is
as follows. We ask how large a cluster must be to have on average one loop. Consider a
cluster of size $S$. The total number of edges associated with vertices in the cluster
(counting each edge once per endpoint) is $kS$. Connectivity among vertices in the
cluster is provided by $2(S - 1)$ of these, and the others (also of order $S$) are
either removed (connected to other partitions) or connected back to the cluster,
forming a loop. Because the graph is random and we partition randomly (subject to the
constraint that the partitions consist of one cluster each), the probability that one
of these edges is connected back to the cluster is

    $P_{\rm loop} \sim S^2/N$.    (11)

Setting $P_{\rm loop} = 1$ we find $S \sim N^{1/2}$.

FIG. 5: At criticality for N = 10^ and $k = 3$, $P(n_{\rm loop})$, the distribution of
the number of loops per cluster, $n_{\rm loop}$. Inset: For N = 10^ and $k = 3$,
typical cluster at criticality, illustrating that typical clusters at criticality
contain 1 loop decorated by trees.
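
The scaling in Eq. (11) can be illustrated with a small Monte Carlo experiment. This is an idealized sketch of the counting argument only, not a simulation of the partitioning itself: we assume exactly $S$ free edge ends (dropping the constant prefactor) and let each land on a uniformly random vertex among $N$, with vertices 0 to $S - 1$ standing for the cluster.

```python
import random


def mean_back_connections(n, s, trials, rng):
    """Average number of free edge ends that land back on a cluster of s vertices."""
    total = 0
    for _ in range(trials):
        # Each of the s free ends picks a random vertex; labels below s are the cluster.
        total += sum(1 for _ in range(s) if rng.randrange(n) < s)
    return total / trials


rng = random.Random(0)
estimate = mean_back_connections(10_000, 100, 2_000, rng)
print(estimate)  # expected value is s*s/n = 1 for s = sqrt(n)
```

With $S = N^{1/2}$ the expected number of back-connections is of order one, which is the condition used to obtain $S \sim N^{1/2}$.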

Random partitioning. Random partitioning is achieved by assigning vertices to
partitions randomly and is equivalent to random site percolation [23], for which the
well known result $f_c = 1 - 1/(k - 1)$ holds. In contrast to the optimal and the
non-optimal partitioning considered above, for random partitioning, partitions contain
clusters of all sizes (including very small ones). Eq. (2) holds for the spanning
cluster in each partition but does not hold for all clusters, and $f_c$ is therefore
significantly larger.
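
Random partitioning is simple enough to sketch end to end. The code below is an illustration under our own assumptions: the random regular graph is drawn with the pairing model and rejection of self-loops and multi-edges (one standard construction, not necessarily the one used for the figures), and the function names are hypothetical.

```python
import random


def random_regular_graph(n, k, rng):
    """Pairing model with rejection; returns a sorted list of edges (u, v) with u < v."""
    if (n * k) % 2 or k >= n:
        raise ValueError("need n*k even and k < n")
    while True:
        stubs = [v for v in range(n) for _ in range(k)]
        rng.shuffle(stubs)
        edges = set()
        for u, v in zip(stubs[0::2], stubs[1::2]):
            edge = (min(u, v), max(u, v))
            if u == v or edge in edges:
                break  # self-loop or multi-edge: reject this pairing and retry
            edges.add(edge)
        else:
            return sorted(edges)


def random_partition(n, edges, parts, rng):
    """Assign vertices to partitions at random; return (f, s) = (fraction cut, S/N)."""
    part = [rng.randrange(parts) for _ in range(n)]
    kept = [(u, v) for u, v in edges if part[u] == part[v]]
    adj = {v: [] for v in range(n)}
    for u, v in kept:
        adj[u].append(v)
        adj[v].append(u)
    seen, largest = set(), 0
    for start in range(n):  # depth-first search over each cluster
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        largest = max(largest, size)
    return 1 - len(kept) / len(edges), largest / n


rng = random.Random(1)
graph = random_regular_graph(20, 3, rng)
print(random_partition(20, graph, 4, rng))
```

Unlike the partitionings discussed earlier, nothing here forces a partition to be a single cluster, so each partition generally breaks into clusters of many sizes.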

Non-percolation transitions. In Fig. 1 we see that the order parameter is
discontinuous at values of $s$ = 1/2, 1/3, ..., qualifying these points as first order
phase transitions. However, these discontinuities, which occur where the number of
partitions changes, are not percolation transitions: the scaling of $s$ with $N$ does
not change. The behavior at these transitions (and the general shape of the segments
of the plots) can be understood as follows. Consider the region of the plot
corresponding to two partitions ($1/2 < s < 1$) and assume we reduce the size of the
larger partition (increasing the size of the smaller partition) by moving selected
vertices one by one from the larger partition to the smaller partition [23].
Initially, the number of edges that must be removed when we move a vertex is $k$: all
edges adjacent to the moved vertex must be removed. As the size of the smaller
partition increases, we can select a vertex requiring fewer of its edges to be removed
because some of its edges already have ends in the smaller partition. At some point,
the number of edges from a vertex to be moved to the smaller partition is equal to the
number of that vertex's edges to the larger partition; thus, there is zero cost to the
move [2^. This continues to be the case until the partitions are of equal size,
resulting in the discontinuity.
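
The cost of a single move in this argument is easy to make explicit. The following minimal sketch (the function name and the toy graph are ours) computes the change in the number of cut edges when one vertex switches sides in a two-partition setting.

```python
def move_cost(adj, side, v):
    """Change in the number of cut edges if vertex v moves to the other partition."""
    same = sum(1 for w in adj[v] if side[w] == side[v])  # these edges become cut
    other = len(adj[v]) - same                           # these edges are restored
    return same - other


# Toy example: vertex 0 has degree 4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
all_large = {v: "large" for v in adj}
mixed = {0: "large", 1: "large", 2: "large", 3: "small", 4: "small"}
print(move_cost(adj, all_large, 0), move_cost(adj, mixed, 0))
```

When all neighbors are on the vertex's own side the cost is its full degree, and it drops to zero once the neighbors are split evenly between the two partitions.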

Discussion. If a graph with an arbitrary degree distribution $P(k)$ can be partitioned
such that there is one cluster per partition, then our result for $f_c$ should be
generalized to

    $f_c = 1 - 2/\langle k \rangle$,    (12)

where $\langle k \rangle$ is the average degree per vertex. Areas for future work
include determining whether this is the case for partitioning on such other types of
graphs as Erdos-Renyi and scale-free graphs. Also of interest will be determining if
there exists a Euclidean analog to our graph partitioning problem.